1. Model Introduction

LTX-2 and LTX-2.3 are video generation models from Lightricks. SGLang Diffusion supports the LTX series through native one-stage and two-stage pipelines for text-to-video and image-conditioned video generation. Use Lightricks/LTX-2 or Lightricks/LTX-2.3 as --model-path. For two-stage generation, SGLang uses the spatial upsampler and distilled LoRA components from the model snapshot by default. LTX-2.3 also supports the HQ two-stage variant.
License notice: LTX-2 and LTX-2.3 are released under the LTX-2 Community License Agreement, not Apache 2.0. The license includes commercial-use restrictions for some entities. Review the official Lightricks license before production or commercial use; SGLang support does not grant additional model usage rights.

2. SGLang Diffusion Installation

Install SGLang with diffusion dependencies:
uv pip install "sglang[diffusion]" --prerelease=allow
For platform-specific setup, see the SGLang Diffusion installation guide.

3. Model Deployment

This section provides deployment configurations optimized for different LTX pipelines and hardware targets.

3.1 Basic Configuration

The LTX series supports one-stage and two-stage pipelines. LTX-2.3 also supports the HQ two-stage pipeline. The recommended launch configuration depends on whether the target GPU can keep both two-stage DiTs resident.

Interactive Command Generator: Use the configuration selector below to generate a deployment command. The default selection targets a single NVIDIA H200 with resident two-stage mode. For multi-GPU serving, start from the 2-GPU or 4-GPU presets and only change parallelism if you need more memory headroom.

3.2 Configuration Tips

Choose the pipeline class based on the quality and latency target:
| Use case | Pipeline class | Notes |
| --- | --- | --- |
| One-stage generation | LTX2Pipeline | Fastest LTX native path. Supports T2V and TI2V. |
| Two-stage generation | LTX2TwoStagePipeline | Uses a base stage and a refinement stage. Supported by LTX-2 and LTX-2.3. |
| Two-stage High Quality (HQ) generation | LTX2TwoStageHQPipeline | LTX-2.3 HQ path; defaults to 1920x1088 unless you override --width and --height. |
Feature compatibility:
| Pipeline class | T2V | TI2V (--image-path) | LoRA (--lora-path) | Notes |
| --- | --- | --- | --- | --- |
| LTX2Pipeline | Yes | Yes | Yes | One-stage path. Cannot be combined with HQ because HQ is a separate two-stage pipeline class. |
| LTX2TwoStagePipeline | Yes | Yes | Yes | Standard two-stage path for LTX-2 and LTX-2.3. |
| LTX2TwoStageHQPipeline | Yes | Yes | Yes | High Quality two-stage path for LTX-2.3. Use this instead of LTX2Pipeline; it is not a one-stage mode flag. |
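The selection rules in the two tables above can be captured in a small shell helper. This is only a sketch: the function name `ltx_pipeline_class` and the tier labels `fast`, `quality`, and `hq` are hypothetical conveniences and not part of SGLang. Only the pipeline class names and model paths come from this page.

```shell
# Hypothetical helper (not part of SGLang): map a model path and a quality
# tier to one of the pipeline class names listed in the tables above.
ltx_pipeline_class() {
  model="${1:-}"
  tier="${2:-}"
  case "$tier" in
    fast)    printf '%s\n' "LTX2Pipeline" ;;
    quality) printf '%s\n' "LTX2TwoStagePipeline" ;;
    hq)
      # The HQ path is documented for LTX-2.3 only.
      if [ "$model" = "Lightricks/LTX-2.3" ]; then
        printf '%s\n' "LTX2TwoStageHQPipeline"
      else
        printf 'HQ requires Lightricks/LTX-2.3, got: %s\n' "$model" >&2
        return 1
      fi
      ;;
    *)
      printf 'unknown tier: %s\n' "$tier" >&2
      return 1
      ;;
  esac
}

# Print the class name to pass to --pipeline-class-name.
ltx_pipeline_class Lightricks/LTX-2.3 hq
```

The printed name is the value for --pipeline-class-name. Requesting `hq` with Lightricks/LTX-2 returns a non-zero status instead of a class name.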
For two-stage pipelines, --ltx2-two-stage-device-mode controls transformer residency:
| Mode | When to use it |
| --- | --- |
| snapshot | Recommended default. Balances latency and VRAM. |
| resident | Best latency on high-VRAM GPUs because both DiTs can stay resident. |
| original | Closest to the original two-stage switching semantics. |
Other deployment flags:
  • --lora-path: Preload a community LoRA adapter.
  • --lora-weight-name: Select the exact safetensors file when the LoRA repository contains multiple weight files.
For native LTX-2.3 two-stage serving without a user LoRA, resident is the fastest path on high-VRAM GPUs. When you pass --lora-path, SGLang still applies the user LoRA during the two-stage switch. resident remains the right choice on H200-class GPUs with enough VRAM, but do not expect the same premerged-stage2 benefit as on the no-user-LoRA path.

3.3 Fast multi-GPU presets

For latency-oriented LTX serving, prefer classifier-free guidance (CFG) parallel over sequence parallelism (SP). CFG parallel splits the guidance branches across GPUs, whereas SP/Ulysses is mainly a memory and long-sequence tool for LTX.
| Target | Recommended server flags | Notes |
| --- | --- | --- |
| 1 high-VRAM GPU | --ltx2-two-stage-device-mode resident | Fastest two-stage setup when both DiTs fit. |
| 1 standard GPU | --ltx2-two-stage-device-mode snapshot | Lower VRAM than resident; use this when H100-class memory is tight. |
| 2 GPUs | --num-gpus 2 --enable-cfg-parallel --ltx2-two-stage-device-mode resident | Fastest common 2-GPU setup. |
| 4 GPUs | --num-gpus 4 --tp-size 2 --enable-cfg-parallel --ltx2-two-stage-device-mode resident | Fastest common 4-GPU layout: TP2 inside each CFG branch. |
| Official comparison | --ltx2-two-stage-device-mode original | Use this only when matching the original stage-switch semantics matters. |
Use --enable-cfg-parallel for degree-2 CFG parallel. Use --cfg-parallel-size only when you explicitly need a different CFG branch count. If resident exceeds available VRAM, keep the same parallelism preset and switch only the device mode to snapshot. On high-VRAM GPUs, add --text-encoder-cpu-offload false if text encoding latency matters and you have enough memory.
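The preset table and the fallback rule above (keep the parallelism, change only the device mode) can be sketched as a shell function. The name `ltx_preset_flags` is hypothetical and not part of SGLang. The flags it prints are copied from the table, and the function only prints them; it does not start a server.

```shell
# Hypothetical helper (not part of SGLang): print the server flags from the
# preset table for a GPU count, with an optional device-mode override.
ltx_preset_flags() {
  gpus="${1:-}"
  mode="${2:-resident}"   # pass "snapshot" if resident exceeds available VRAM
  case "$mode" in
    resident|snapshot|original) ;;
    *) printf 'unknown device mode: %s\n' "$mode" >&2; return 1 ;;
  esac
  # Parallelism depends only on the GPU count, never on the device mode.
  case "$gpus" in
    1) parallel="" ;;
    2) parallel="--num-gpus 2 --enable-cfg-parallel " ;;
    4) parallel="--num-gpus 4 --tp-size 2 --enable-cfg-parallel " ;;
    *) printf 'no preset for %s GPUs\n' "$gpus" >&2; return 1 ;;
  esac
  printf '%s--ltx2-two-stage-device-mode %s\n' "$parallel" "$mode"
}

ltx_preset_flags 2            # 2-GPU preset with the resident default
ltx_preset_flags 4 snapshot   # same 4-GPU parallelism, lower-VRAM mode
```

The output can be appended to a launch command, for example `sglang serve --model-path Lightricks/LTX-2.3 --pipeline-class-name LTX2TwoStagePipeline $(ltx_preset_flags 2)`.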

3.3.1 Two GPUs

sglang serve \
  --model-path Lightricks/LTX-2.3 \
  --pipeline-class-name LTX2TwoStagePipeline \
  --num-gpus 2 \
  --enable-cfg-parallel \
  --ltx2-two-stage-device-mode resident

3.3.2 Four GPUs

sglang serve \
  --model-path Lightricks/LTX-2.3 \
  --pipeline-class-name LTX2TwoStagePipeline \
  --num-gpus 4 \
  --tp-size 2 \
  --enable-cfg-parallel \
  --ltx2-two-stage-device-mode resident

4. Model Invocation

4.1 Basic Usage

The examples below rely on the current SGLang sampling defaults, listed here for reproducibility:
| Model path | Default output | Default frames | Default steps |
| --- | --- | --- | --- |
| Lightricks/LTX-2 | 768x512 | 121 | 40 |
| Lightricks/LTX-2.3 | 768x512 | 121 | 30 |
| Lightricks/LTX-2.3 with LTX2TwoStageHQPipeline | 1920x1088 | 121 | 15 |
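To pin these defaults explicitly so that a later change in SGLang does not silently alter results, the table can be turned into command-line flags. The sketch below makes several assumptions. The function name `ltx_default_sampling_flags` and the variant labels are hypothetical. --width and --height appear on this page, but the --num-frames and --num-inference-steps flag names are assumed and should be confirmed against the CLI help of your installed version.

```shell
# Hypothetical helper (not part of SGLang): print explicit sampling flags that
# match the defaults table. --width and --height appear on this page; the
# --num-frames and --num-inference-steps names are assumptions to verify.
ltx_default_sampling_flags() {
  case "${1:-}" in
    ltx-2)      width=768;  height=512;  steps=40 ;;
    ltx-2.3)    width=768;  height=512;  steps=30 ;;
    ltx-2.3-hq) width=1920; height=1088; steps=15 ;;
    *) printf 'unknown variant: %s\n' "${1:-}" >&2; return 1 ;;
  esac
  # All three rows of the table share the same default frame count.
  printf '%s\n' "--width $width --height $height --num-frames 121 --num-inference-steps $steps"
}

ltx_default_sampling_flags ltx-2.3-hq
```

Appending the printed flags to one of the `sglang generate` commands below should reproduce the default behavior, provided the assumed flag names are correct.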

4.1.1 LTX-2 one-stage text-to-video

sglang generate \
  --model-path Lightricks/LTX-2 \
  --pipeline-class-name LTX2Pipeline \
  --prompt "A quiet coastal town at sunrise, fishing boats moving slowly through golden mist, cinematic camera movement" \
  --save-output

4.1.2 LTX-2.3 one-stage text-to-video

sglang generate \
  --model-path Lightricks/LTX-2.3 \
  --pipeline-class-name LTX2Pipeline \
  --prompt "A quiet coastal town at sunrise, fishing boats moving slowly through golden mist, cinematic camera movement" \
  --save-output

4.1.3 LTX-2 two-stage text-to-video

sglang generate \
  --model-path Lightricks/LTX-2 \
  --pipeline-class-name LTX2TwoStagePipeline \
  --prompt "A handheld shot follows a red tram crossing a rainy city square at night, reflections on the pavement, cinematic lighting" \
  --save-output

4.1.4 LTX-2.3 two-stage text-to-video

sglang generate \
  --model-path Lightricks/LTX-2.3 \
  --pipeline-class-name LTX2TwoStagePipeline \
  --prompt "A handheld shot follows a red tram crossing a rainy city square at night, reflections on the pavement, cinematic lighting" \
  --save-output

4.1.5 LTX-2.3 HQ text-to-video

sglang generate \
  --model-path Lightricks/LTX-2.3 \
  --pipeline-class-name LTX2TwoStageHQPipeline \
  --prompt "A wide cinematic shot of alpine clouds rolling over a mountain ridge, soft morning light, slow aerial camera movement" \
  --save-output

4.1.6 Image-to-video with one reference image

Pass one image to --image-path for image-conditioned generation:
sglang generate \
  --model-path Lightricks/LTX-2.3 \
  --pipeline-class-name LTX2TwoStagePipeline \
  --image-path ./inputs/start.png \
  --prompt "The camera slowly pushes forward as the subject turns toward warm window light, subtle natural motion, cinematic" \
  --save-output

4.1.7 First-to-last-frame transition with two reference images

Pass two images to --image-path for transition-style TI2V. The first image is used as the starting condition and the second image is used as the ending condition.
sglang generate \
  --model-path Lightricks/LTX-2.3 \
  --pipeline-class-name LTX2TwoStagePipeline \
  --image-path ./inputs/start.png ./inputs/end.png \
  --prompt "A smooth cinematic transition from the first scene into the final scene, dynamic camera motion, motion blur, zhuanchang" \
  --save-output

4.2 Advanced Usage

4.2.1 Use community LoRAs

Use --lora-path to load a LoRA adapter. If the Hugging Face repo contains multiple safetensors files, use --lora-weight-name to select the exact file. --lora-scale maps to the standard LoRA merge scale and defaults to 1.0. The following example uses valiantcat/LTX-2.3-Transition-LORA:
sglang generate \
  --model-path Lightricks/LTX-2.3 \
  --pipeline-class-name LTX2TwoStagePipeline \
  --lora-path valiantcat/LTX-2.3-Transition-LORA \
  --lora-weight-name ltx2.3-transition.safetensors \
  --prompt "A low-angle tracking shot moves through a foggy forest road. The camera rises above the treetops and transitions into a clear view of a snowy mountain peak under bright sunlight, zhuanchang" \
  --save-output
You can combine the Transition LoRA with two reference images:
sglang generate \
  --model-path Lightricks/LTX-2.3 \
  --pipeline-class-name LTX2TwoStagePipeline \
  --image-path ./inputs/start.png ./inputs/end.png \
  --lora-path valiantcat/LTX-2.3-Transition-LORA \
  --lora-weight-name ltx2.3-transition.safetensors \
  --prompt "A fast cinematic transition from the first image to the second image, whip-pan motion, atmospheric lighting, zhuanchang" \
  --save-output
Some community LoRAs only include weights for transformer blocks. In that case, SGLang logs a concise coverage summary and leaves unmatched LoRA-capable layers on the base model weights. This is expected when the adapter format intentionally omits those layers.
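The three LoRA flags described in this section can be assembled with a small helper when the weight file and scale are optional. This is a sketch under stated assumptions: the function name `ltx_lora_flags` is hypothetical and not part of SGLang, and the scale value 0.8 is an arbitrary illustration, not a recommended setting.

```shell
# Hypothetical helper (not part of SGLang): build the LoRA flags described in
# this section. The weight file and scale are optional arguments.
ltx_lora_flags() {
  repo="${1:-}"
  weight="${2:-}"
  scale="${3:-}"
  if [ -z "$repo" ]; then
    printf 'usage: ltx_lora_flags REPO [WEIGHT_FILE] [SCALE]\n' >&2
    return 1
  fi
  flags="--lora-path $repo"
  # Only needed when the repository holds several safetensors files.
  if [ -n "$weight" ]; then
    flags="$flags --lora-weight-name $weight"
  fi
  # Omitting the scale leaves the documented default of 1.0 in effect.
  if [ -n "$scale" ]; then
    flags="$flags --lora-scale $scale"
  fi
  printf '%s\n' "$flags"
}

# 0.8 is an arbitrary example value.
ltx_lora_flags valiantcat/LTX-2.3-Transition-LORA ltx2.3-transition.safetensors 0.8
```

The helper only prints flags, so the result can be reviewed before it is appended to a `sglang generate` command.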

5. Practical Tips

  • Use --pipeline-class-name LTX2TwoStagePipeline as the default LTX two-stage quality path.
  • Use --pipeline-class-name LTX2TwoStageHQPipeline when you want the HQ path and have enough VRAM for larger outputs.
  • Use --ltx2-two-stage-device-mode resident on high-VRAM GPUs if latency matters more than memory usage.
  • Use --ltx2-two-stage-device-mode original when comparing against official two-stage behavior.
  • Keep --width and --height aligned with the target model resolution; for LTX models, these are output video dimensions.